Fujitsu's PHOTON claims up to a 475x gain over Transformers, but that figure is measured in tokens/s/GiB (multi-query throughput per unit of memory), not in faster single responses. What the paper's 1.2B tables, the quality drop, and 9-query integration really show.
Sakana Fugu trains no base model: a learned conductor routes requests across GPT-5.5, Claude, and Gemini. How it compares to PLaMo (trained from scratch, closed) and LLM-jp (fully open), how it differs from OpenRouter, and its biggest risk.
Nine Japanese-specialized LLMs as of April 2026, including LLM-jp-4 (11.7T tokens from scratch), PLaMo, Nemotron Nano 9B JP (#1 sub-10B on Nejumi 4), Swallow 30B-A3B, and Namazu, broken down by whether they were trained from scratch, continually pre-trained, or post-trained, with size, license, and benchmark scores.
NVIDIA has released Nemotron-Nano-9B-v2-Japanese. It takes first place in the sub-10B category on Nejumi Leaderboard 4, delivering strong performance in Japanese knowledge, QA, and tool calling.